Data decisiveness, data quality, and incongruence in phylogenetic analysis: an example from the monocotyledons using mitochondrial atpA sequences

Syst Biol. 1998 Jun;47(2):282-310. doi: 10.1080/106351598260923.

Abstract

We examined three parallel data sets with respect to qualities relevant to phylogenetic analysis of 20 exemplar monocotyledons and related dicotyledons. The three data sets represent restriction-site variation in the inverted repeat region of the chloroplast genome, and nucleotide sequence variation in the chloroplast-encoded gene rbcL and in the mitochondrion-encoded gene atpA, the latter of which encodes the alpha-subunit of mitochondrial ATP synthase. The plant mitochondrial genome has been little used in plant systematics, in part because nucleotide sequence evolution in enzyme-encoding genes of this genome is relatively slow. The three data sets were examined in separate and combined analyses, with a focus on patterns of congruence, homoplasy, and data decisiveness. Data decisiveness (described by P. Goloboff) is a measure of robustness of support for most parsimonious trees by a data set in terms of the degree to which those trees are shorter than the average length of all possible trees. Because indecisive data sets require relatively fewer additional steps than decisive ones to be optimized on nonparsimonious trees, they will have a lesser tendency to be incongruent with other data sets. One consequence of this relationship between decisiveness and character incongruence is that if incongruence is used as a criterion of noncombinability, decisive data sets, which provide robust support for relationships, are more likely to be assessed as noncombinable with other data sets than are indecisive data sets, which provide weak support for relationships. For the sampling of taxa in this study, the atpA data set has about half as many cladistically informative nucleotides as the rbcL data set per site examined, and is less homoplastic and more decisive. The rbcL data set, which is the least decisive of the three, exhibits the lowest levels of character incongruence. 
Whatever the molecular evolutionary cause of this phenomenon, it seems likely that the poorer performance of rbcL relative to atpA, in terms of data decisiveness, is due both to its higher overall level of homoplasy and to its especially poor performance at nonsynonymous sites.
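
The abstract describes data decisiveness only in words: the degree to which the most parsimonious trees are shorter than the average length of all possible trees. As a minimal sketch, the statistic is assumed here to take the commonly cited form DD = (S̄ - S) / (S̄ - M), where S is the length of the most parsimonious tree, S̄ is the mean length over all possible trees, and M is the minimum conceivable length of the matrix. That formula is not stated in the abstract and should be checked against Goloboff's original description. The four-taxon matrix, the taxon names, and all function names below are hypothetical and chosen only for illustration. They are not data or methods from the study.

```python
from statistics import mean

# Hypothetical 4-taxon, 4-character binary matrix (NOT data from the study).
MATRIX = {"A": "0011", "B": "0010", "C": "1100", "D": "1101"}

# The three possible unrooted topologies for four taxa, written as rooted
# nested tuples (root placement does not change Fitch parsimony length).
TREES = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]


def fitch(tree, states):
    """Return (state set, steps) for one unordered character on a binary tree."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch(tree[0], states)
    right_set, right_steps = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        return shared, left_steps + right_steps
    # Disjoint state sets at the two children imply one extra change.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum Fitch steps over all characters (columns) of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        states = {taxon: seq[i] for taxon, seq in matrix.items()}
        total += fitch(tree, states)[1]
    return total


def minimum_length(matrix):
    """Minimum conceivable length: (number of observed states - 1) per character."""
    n_chars = len(next(iter(matrix.values())))
    return sum(len({seq[i] for seq in matrix.values()}) - 1 for i in range(n_chars))


def data_decisiveness(matrix, trees):
    """Assumed form of the statistic: (mean length - MP length) / (mean length - minimum)."""
    lengths = [tree_length(t, matrix) for t in trees]
    s_bar = mean(lengths)
    s = min(lengths)
    m = minimum_length(matrix)
    if s_bar == m:
        raise ValueError("undefined: every tree already has the minimum length")
    return (s_bar - s) / (s_bar - m)


lengths = [tree_length(t, MATRIX) for t in TREES]
dd = data_decisiveness(MATRIX, TREES)
print(lengths, dd)
```

In this toy matrix the three topologies have lengths 5, 8, and 7 steps and the minimum conceivable length is 4, so the sketch gives a decisiveness of about 0.625. A matrix whose characters fit all topologies equally well would score near 0, which illustrates why an indecisive data set needs few extra steps on nonparsimonious trees. Exhaustive enumeration is feasible only for very small numbers of taxa. For a sample of 20 taxa such as the one in this study, the mean length over all trees would have to be estimated some other way.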

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adenosine Triphosphatases / chemistry
  • Adenosine Triphosphatases / genetics*
  • Base Sequence
  • DNA Primers
  • Magnoliopsida / classification
  • Magnoliopsida / genetics*
  • Mitochondria / genetics*
  • Phylogeny*
  • Reproducibility of Results
  • Restriction Mapping

Substances

  • DNA Primers
  • Adenosine Triphosphatases